Summarizing twitter posts regarding COVID-19 based on n-grams

Abstract

The COVID-19 pandemic announced by the World Health Organization has disrupted human lives at different scales, including the economy, public health, and people's emotions. Social media platforms have accumulated a huge volume of information concerning this pandemic. Twitter is considered one of the most active social media platforms, enabling users to tweet in conversations they care about. A problem arises when users want to search for a specific topic: they can sort tweets only by recency, not by relevance, so they must read through the tweets to understand what was first discussed about the topic. Some strategies have been developed for summarizing tweets, but summarizing topics is still in its early stages. The current research aims to introduce a technique that presents a short summary while consuming little time and effort. The summarization task starts by clustering the tweets using the latent Dirichlet allocation (LDA) method and K-means, then selects the important sentences to form the summary. The study also compares bigram-based summarization with unigram-based summarization. Different metrics are used to evaluate the results of the experiments at each stage, and the output of the proposed system is evaluated using ROUGE metrics.
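As a rough illustration of the unigram versus bigram comparison and the ROUGE evaluation mentioned in the abstract, the sketch below computes ROUGE-N precision, recall, and F1 from n-gram overlap. It is not the authors' implementation: the abstract does not say which ROUGE variant or toolkit was used, and the function names `ngrams` and `rouge_n`, the whitespace tokenization, and the two example sentences are all assumptions made here for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(candidate, reference, n):
    """ROUGE-N scores from clipped n-gram overlap.

    Tokenization is plain lowercased whitespace splitting, which is an
    assumption; real tweet pipelines usually clean text more carefully.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentences, not taken from the paper's data.
reference = "vaccines reduce severe covid cases"
candidate = "vaccines reduce covid cases"

for n, label in [(1, "unigram"), (2, "bigram")]:
    print(label, rouge_n(candidate, reference, n))
```

On this toy pair the bigram scores come out lower than the unigram scores, because dropping one word breaks two of the reference bigrams; that sensitivity to word order is one reason the two n-gram sizes can rank summaries differently.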


Similar resources

Ontology-based sentiment analysis of twitter posts

The emergence of Web 2.0 has drastically altered the way users perceive the Internet, by improving information sharing, collaboration and interoperability. Micro-blogging is one of the most popular Web 2.0 applications and related services, like Twitter, have evolved into a practical means for sharing opinions on almost all aspects of everyday life. Consequently, micro-blogging web sites have s...

Classification based approach for Summarizing Opinions in Blog Posts

With the growth of the web, people are using it as a medium for expressing their opinions and thoughts through blog posts, reviews (in the form of ratings), and forums. The blogosphere is a place where people read, write their views, and comment on others' views or thoughts, thereby exchanging information. It will be very difficult for any business, organization or individual to go through and underst...

On Automatic Plagiarism Detection Based on n-Grams Comparison

When automatic plagiarism detection is carried out considering a reference corpus, a suspicious text is compared to a set of original documents in order to relate the plagiarised text fragments to their potential source. One of the biggest difficulties in this task is to locate plagiarised fragments that have been modified (by rewording, insertion or deletion, for example) from the source text....

Author Profiling for Arabic Tweets based on n-grams

This paper presents an approach for author profiling of unknown users from their texts produced in social media. In particular, we address the identification of two profile dimensions, gender and language variety, of Arabic Twitter users based on their tweets. Our approach focused on applying a metaclassification technique on features extracted from the tweet body. We explored two main sets of fe...

Correcting Serial Grammatical Errors based on N-grams and Syntax

In this paper, we present a new method based on machine translation for correcting serial grammatical errors in a given sentence in learners’ writing. In our approach, translation models are generated to translate the input into a grammatical sentence. The method involves automatically learning two translation models that are based on Web-scale n-grams. The first model translates trigrams conta...

Journal

Journal title: Indonesian Journal of Electrical Engineering and Computer Science

Year: 2023

ISSN: 2502-4752, 2502-4760

DOI: https://doi.org/10.11591/ijeecs.v31.i2.pp1008-1015